Deep supervised learning using local errors
Error backpropagation is a highly effective mechanism for learning
high-quality hierarchical features in deep networks. Updating the features or
weights in one layer, however, requires waiting for the propagation of error
signals from higher layers. Learning using delayed and non-local errors makes
it hard to reconcile backpropagation with the learning mechanisms observed in
biological neural networks as it requires the neurons to maintain a memory of
the input long enough until the higher-layer errors arrive. In this paper, we
propose an alternative learning mechanism where errors are generated locally in
each layer using fixed, random auxiliary classifiers. Lower layers could thus
be trained independently of higher layers and training could either proceed
layer by layer, or simultaneously in all layers using local error information.
We address biological plausibility concerns such as weight symmetry
requirements and show that the proposed learning mechanism based on fixed,
broad, and random tuning of each neuron to the classification categories
outperforms the biologically-motivated feedback alignment learning technique on
the MNIST, CIFAR10, and SVHN datasets, approaching the performance of standard
backpropagation. Our approach highlights a potential biological mechanism for
the supervised, or task-dependent, learning of feature hierarchies. In
addition, we show that it is well suited for learning deep networks in custom
hardware where it can drastically reduce memory traffic and data communication
overheads.
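The local-error idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the layer sizes, the tanh nonlinearity, the cross-entropy local loss, the learning rate, and the function names `make_layer` and `local_step` are all assumptions. In particular, the sketch sends the local error back through the transpose of the fixed classifier, whereas the abstract only says that weight-symmetry concerns are addressed, not how.

```python
import math
import random

def make_layer(n_in, n_hidden, n_classes, rng):
    """Trainable weights W plus a fixed random classifier C (never updated)."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    C = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)] for _ in range(n_classes)]
    return W, C

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def local_step(W, C, x, label, lr=0.01):
    """One local update of W; returns (loss before the update, activations)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]
    p = softmax([sum(c * hj for c, hj in zip(row, h)) for row in C])
    loss = -math.log(p[label])
    # Local error at the auxiliary classifier output: softmax minus one-hot.
    err = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    for j, row in enumerate(W):
        # Error reaching hidden unit j through the fixed classifier only
        # (assumption: transpose of C is used for this local backward step).
        dh = sum(C[k][j] * err[k] for k in range(len(C)))
        dpre = dh * (1.0 - h[j] ** 2)  # tanh derivative
        for i in range(len(row)):
            row[i] -= lr * dpre * x[i]
    return loss, h
```

Because the update uses only the layer's own input, its own activations, and the label, the returned activations `h` can be fed to a second `make_layer`/`local_step` pair without any error signal travelling between layers.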
A Charge-Based CMOS Parallel Analog Vector Quantizer
We present an analog VLSI chip for parallel analog vector quantization. The MOSIS 2.0 μm double-poly CMOS Tiny chip contains an array of 16 x 16 charge-based distance estimation cells, implementing a mean absolute difference (MAD) metric operating on a 16-input analog
vector field and 16 analog template vectors. The distance cell including dynamic template storage measures 60 x 78 μm^2. Additionally, the chip features a winner-take-all (WTA) output circuit of linear complexity, with global positive feedback for fast and decisive settling of a
single winner output. Experimental results on the complete 16 x 16 VQ system demonstrate correct operation with 34 dB analog input dynamic range and 3 μsec cycle time at 0.7 mW power dissipation.
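In software terms, the computation the chip performs in analog can be sketched as follows. The function names `mad` and `quantize` are hypothetical, and the sketch assumes the winner is the template with the smallest distance; it says nothing about the circuit itself.

```python
def mad(x, template):
    """Mean absolute difference between an input vector and one template."""
    return sum(abs(a - b) for a, b in zip(x, template)) / len(x)

def quantize(x, templates):
    """Return the index of the closest template and all distances."""
    distances = [mad(x, t) for t in templates]
    # Winner-take-all: exactly one index is selected, the minimum distance.
    winner = min(range(len(templates)), key=distances.__getitem__)
    return winner, distances
```

For the chip described above, `x` would have 16 components and `templates` would hold 16 vectors of the same length.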
Hardware-efficient on-line learning through pipelined truncated-error backpropagation in binary-state networks
Artificial neural networks (ANNs) trained using backpropagation are powerful
learning architectures that have achieved state-of-the-art performance in
various benchmarks. Significant effort has been devoted to developing custom
silicon devices to accelerate inference in ANNs. Accelerating the training
phase, however, has attracted relatively little attention. In this paper, we
describe a hardware-efficient on-line learning technique for feedforward
multi-layer ANNs that is based on pipelined backpropagation. Learning is
performed in parallel with inference in the forward pass, removing the need for
an explicit backward pass and requiring no extra weight lookup. By using binary
state variables in the feedforward network and ternary errors in
truncated-error backpropagation, the need for any multiplications in the
forward and backward passes is removed, and memory requirements for the
pipelining are drastically reduced. Further reduction in addition operations
owing to the sparsity in the forward neural and backpropagating error signal
paths contributes to highly efficient hardware implementation. For
proof-of-concept validation, we demonstrate on-line learning of MNIST
handwritten digit classification on a Spartan 6 FPGA interfacing with an
external 1 Gb DDR2 DRAM, which shows a small degradation in test error performance
compared to an equivalently sized binary ANN trained off-line using standard
backpropagation and exact errors. Our results highlight an attractive synergy
between pipelined backpropagation and binary-state networks in substantially
reducing computation and memory requirements, making pipelined on-line learning
practical in deep networks.
Comment: Now also consider 0/1 binary activations. Memory access statistics reported
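The claim that binary states and ternary errors remove multiplications can be illustrated with a short sketch. The function names, the +1/-1 state encoding, and the threshold-based ternarization rule are assumptions made for illustration; the abstract does not specify how errors are truncated.

```python
def binarize(value):
    """Map a pre-activation to a binary state in {-1, +1}."""
    return 1 if value >= 0 else -1

def binary_forward(weights, states):
    """Weighted sum with +/-1 inputs: each term is an add or a subtract."""
    total = 0.0
    for w, s in zip(weights, states):
        total = total + w if s > 0 else total - w
    return total

def ternarize(err, threshold):
    """Truncate an error to {-1, 0, +1} (assumed rule: simple thresholding)."""
    if err > threshold:
        return 1
    if err < -threshold:
        return -1
    return 0
```

A zero ternary error means the corresponding update can be skipped entirely, which is one way the sparsity mentioned in the abstract could reduce the number of additions.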
Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems
Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been
demonstrated to perform efficiently in a variety of applications, such as
dimensionality reduction, feature learning, and classification. Their
implementation on neuromorphic hardware platforms emulating large-scale
networks of spiking neurons can have significant advantages from the
perspectives of scalability, power dissipation and real-time interfacing with
the environment. However, the traditional RBM architecture and the commonly used
training algorithm known as Contrastive Divergence (CD) are based on discrete
updates and exact arithmetic, which do not directly map onto a dynamical neural
substrate. Here, we present an event-driven variation of CD to train an RBM
constructed with Integrate & Fire (I&F) neurons that is constrained by the
limitations of existing and near-future neuromorphic hardware platforms. Our
strategy is based on neural sampling, which allows us to synthesize a spiking
neural network that samples from a target Boltzmann distribution. The recurrent
activity of the network replaces the discrete steps of the CD algorithm, while
Spike Time Dependent Plasticity (STDP) carries out the weight updates in an
online, asynchronous fashion. We demonstrate our approach by training an RBM
composed of leaky I&F neurons with STDP synapses to learn a generative model of
the MNIST hand-written digit dataset, and by testing it in recognition,
generation and cue integration tasks. Our results contribute to a machine
learning-driven approach for synthesizing networks of spiking neurons capable
of carrying out practical, high-level functionality.
Comment: (Under review)
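For readers unfamiliar with the neuron model named in this abstract, a leaky integrate-and-fire unit can be sketched as below. This shows only the neuron dynamics, not the event-driven CD rule or STDP; the function name `simulate_lif`, the Euler integration, and every parameter value are assumptions.

```python
def simulate_lif(current, steps=1000, dt=0.01, tau=1.0, threshold=1.0):
    """Euler-integrate dv/dt = -v/tau + current; return spike step indices."""
    v = 0.0
    spikes = []
    for n in range(steps):
        v += dt * (-v / tau + current)
        if v >= threshold:
            spikes.append(n)
            v = 0.0  # reset the membrane potential after each spike
    return spikes
```

With these assumed parameters, a constant input whose steady-state potential `current * tau` stays below `threshold` never produces a spike, while a larger input produces a regular spike train.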
EEG-Based Quantification of Cortical Current Density and Dynamic Causal Connectivity Generalized across Subjects Performing BCI-Monitored Cognitive Tasks.
Quantification of dynamic causal interactions among brain regions constitutes an important component of conducting research and developing applications in experimental and translational neuroscience. Furthermore, cortical networks with dynamic causal connectivity in brain-computer interface (BCI) applications offer a more comprehensive view of brain states implicated in behavior than do individual brain regions. However, models of cortical network dynamics are difficult to generalize across subjects because current electroencephalography (EEG) signal analysis techniques are limited in their ability to reliably localize sources across subjects. We propose an algorithmic and computational framework for identifying cortical networks across subjects in which dynamic causal connectivity is modeled among user-selected cortical regions of interest (ROIs). We demonstrate the strength of the proposed framework using a "reach/saccade to spatial target" cognitive task performed by 10 right-handed individuals. Modeling of causal cortical interactions was accomplished through measurement of cortical activity using EEG, application of independent component clustering to identify cortical ROIs as network nodes, estimation of cortical current density using cortically constrained low resolution electromagnetic brain tomography (cLORETA), multivariate autoregressive (MVAR) modeling of representative cortical activity signals from each ROI, and quantification of the dynamic causal interaction among the identified ROIs using the short-time direct directed transfer function (SdDTF). The resulting cortical network and the computed causal dynamics among its nodes exhibited physiologically plausible behavior, consistent with past results reported in the literature. This physiological plausibility strengthens the framework's applicability in reliably capturing complex brain functionality, which is required by applications such as diagnostics and BCI.
Learning Non-deterministic Representations with Energy-based Ensembles
The goal of a generative model is to capture the distribution underlying the
data, typically through latent variables. After training, these variables are
often used as a new representation, more effective than the original features
in a variety of learning tasks. However, the representations constructed by
contemporary generative models are usually point-wise deterministic mappings
from the original feature space. Thus, even with representations robust to
class-specific transformations, statistically driven models trained on them
would not be able to generalize when the labeled data is scarce. Inspired by
the stochasticity of the synaptic connections in the brain, we introduce
Energy-based Stochastic Ensembles. These ensembles can learn non-deterministic
representations, i.e., mappings from the feature space to a family of
distributions in the latent space. These mappings are encoded in a distribution
over a (possibly infinite) collection of models. By conditionally sampling
models from the ensemble, we obtain multiple representations for every input
example and effectively augment the data. We propose an algorithm similar to
contrastive divergence for training restricted Boltzmann stochastic ensembles.
Finally, we demonstrate the concept of the stochastic representations on a
synthetic dataset as well as test them in the one-shot learning scenario on
MNIST.
Comment: 9 pages, 3 figures, ICLR-15 workshop contribution